Improving language-universal feature extraction with deep maxout and convolutional neural networks

نویسندگان

  • Yajie Miao
  • Florian Metze
چکیده

When deployed in automated speech recognition (ASR), deep neural networks (DNNs) can be treated as a complex feature extractor plus a simple linear classifier. Previous work has investigated the utility of multilingual DNNs acting as language-universal feature extractors (LUFEs). In this paper, we explore different strategies to further improve LUFEs. First, we replace the standard sigmoid nonlinearity with the recently proposed maxout units. The resulting maxout LUFEs have the nice property of generating sparse feature representations. Second, the convolutional neural network (CNN) architecture is applied to obtain more invariant feature space. We evaluate the performance of LUFEs on a cross-language ASR task. Each of the proposed techniques results in word error rate reduction compared with the existing DNN-based LUFEs. Combining the two methods together brings additional improvement on the target language.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving deep convolutional neural networks with mixed maxout units

Motivated by insights from the maxout-units-based deep Convolutional Neural Network (CNN) that "non-maximal features are unable to deliver" and "feature mapping subspace pooling is insufficient," we present a novel mixed variant of the recently introduced maxout unit called a mixout unit. Specifically, we do so by calculating the exponential probabilities of feature mappings gained by applying ...

متن کامل

Convolutional deep maxout networks for phone recognition

Convolutional neural networks have recently been shown to outperform fully connected deep neural networks on several speech recognition tasks. Their superior performance is due to their convolutional structure that processes several, slightly shifted versions of the input window using the same weights, and then pools the resulting neural activations. This pooling operation makes the network les...

متن کامل

Introducing a method for extracting features from facial images based on applying transformations to features obtained from convolutional neural networks

In pattern recognition, features are denoting some measurable characteristics of an observed phenomenon and feature extraction is the procedure of measuring these characteristics. A set of features can be expressed by a feature vector which is used as the input data of a system. An efficient feature extraction method can improve the performance of a machine learning system such as face recognit...

متن کامل

Improving Deep Neural Networks with Probabilistic Maxout Units

We present a probabilistic variant of the recently introduced maxout unit. The success of deep neural networks utilizing maxout can partly be attributed to favorable performance under dropout, when compared to rectified linear units. It however also depends on the fact that each maxout unit performs a pooling operation over a group of linear transformations and is thus partially invariant to ch...

متن کامل

DeepPainter: Painter Classification Using Deep Convolutional Autoencoders

In this paper we describe the problem of painter classification, and propose a novel approach based on deep convolutional autoencoder neural networks. While previous approaches relied on image processing and manual feature extraction from paintings, our approach operates on the raw pixel level, without any preprocessing or manual feature extraction. We first train a deep convolutional autoencod...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014